ABSTRACT

The goal of this study was to investigate the nature of online comprehension monitoring, its
predictors, and its relation to reading comprehension. Questions were concerned with (a) beginning
readers’ sensitivity to inconsistencies, (b) predictors of online comprehension monitoring, and
(c) the relation of online comprehension monitoring to reading comprehension over and above word
reading and listening comprehension. Using eye tracking technology, online comprehension
monitoring was measured as the amount of time spent rereading target implausible words and looking
back at surrounding contexts. Results from 319 second graders revealed that children spent greater
time fixating on inconsistent than consistent words and engaged in more frequent lookbacks.
Comprehension monitoring was explained by both word reading and listening comprehension. However,
comprehension monitoring did not uniquely predict reading comprehension after accounting for word
reading and listening comprehension. These results provide insight into the nature of
comprehension monitoring and its role in reading comprehension for beginning readers.


Reading comprehension involves complex processes, including many language, cognitive, and print-related
skills. According to the simple view of reading (Hoover & Gough, 1990), this complexity can
be captured by two essential skills, language comprehension and word reading (Catts, Adlof, & Ellis 
Weismer, 2006; Joshi, Tao, Aaron, & Quiroz, 2012; Kim, 2011; see Florit & Cain, 2011, for a review). 
Word reading involves a complex set of skills such as phonological, orthographic, and semantic 
processing (Adams, 1990; Carlisle, 2004; Kim, Apel, & Al Otaiba, 2013; Nagy, Berninger, Abbott, 
Vaughan, & Vermeulen, 2003; Schatschneider, Fletcher, Francis, Carlson, & Foorman, 2004), and 
comprehension involves an even more complex set of language and cognitive skills (e.g., Cain, 
Oakhill, & Bryant, 2004; Kendeou, Bohn-Gettler, White, & Van Den Broek, 2008; Kim, 2015, 2016, 2017;
Perfetti, Landi, & Oakhill, 2007). 

Comprehension monitoring is one such higher order cognitive process involved in comprehension.
It refers to evaluating, regulating, and reflecting on constructed meaning (Baker, 1985; Oakhill
& Cain, 2012) and recognizing comprehension failures (Wagoner, 1983). Successful comprehension 
requires establishing an accurate situation model, which, in turn, relies on a series of construction 
and integration processes (Graesser, Singer, & Trabasso, 1994; Kintsch, 1988; Zwaan & Radvansky, 
1998). Comprehenders first have to construct initial, elementary propositions based on textual 
information and then update and integrate propositions across the text and with one’s background 
knowledge. Comprehension monitoring is hypothesized to play a critical role during the integration 
process, as initial local propositions have to be evaluated for accuracy and consistency and then have 
to be corrected via integration processes to establish a coherent and accurate situation model (Kim, 
2015, 2016). In other words, comprehension monitoring contributes to the evaluation and repair
processes during comprehension, and, therefore, lack of comprehension monitoring would lead to 
an inaccurate situation model. 

Comprehension monitoring has been extensively investigated using an inconsistency detection 
framework (i.e., the ability to detect violation of consistency in a sentence or text; Baker, 1984; Beal,
1990; Cain et al., 2004; Kim & Phillips, 2014) and has been shown to be important to text 
comprehension. Much previous work on comprehension monitoring has been conducted in the 
context of reading (e.g., Cain & Oakhill, 2006; Cain et al., 2004; Oakhill & Cain, 2012; Vosniadou, 
Pearson, & Rogers, 1988; Yuill & Oakhill, 1991). Another line of work also indicates (a) that there is 
large variation in comprehension monitoring in oral language contexts, even for young children 
(Baker, 1984; Kim, 2015, 2016, 2017; Kim & Phillips, 2014; Revelle, Wellman, & Karabenick, 1985; 
Skarakis-Doyle, 2002), and (b) that individual differences in comprehension monitoring are related 
to comprehension of oral texts (i.e., listening comprehension; Kim, 2016, 2017; Kim & Phillips, 
2014). Furthermore, an intervention improved comprehension monitoring in an oral language 
context for prekindergartners from low socio-economic backgrounds (Kim & Phillips, 2016). 

In the present study our goal was to expand our understanding of comprehension monitoring 
and its role in reading by examining online comprehension monitoring for beginning readers (i.e., 
second graders). The aforementioned previous work on comprehension monitoring used tasks that 
demanded a deliberate effort to find anomalies in text by asking children whether they notice any 
inconsistencies in the read or heard sentences. In contrast, we used eye tracking methodology to 
measure online comprehension monitoring because eye movements can capture moment-to-moment
processes in a natural reading context, without drawing the child’s attention to the existence
of inconsistencies in the text. 

Eye movement research in the last four decades has revealed a great deal of information about 
online processes during reading (see Radach & Kennedy, 2004; Rayner, 2009; Rayner & Kliegl, 2012; 
for recent reviews). A number of parameters are available as indicators of underlying processes. For 
example, first-pass reading time, or the summed duration of all fixations within a word or region, 
before the first saccade leaves that region, is thought of as an indicator of lexical access or decoding 
(Rayner, 1998). On the other hand, rereading time, or the time spent on a word after first-pass 
reading, reflects higher order processes such as syntactic integration and construction of a situational 
model (Raney, Campbell, & Bovee, 2014; Rayner, Pollatsek, Ashby, & Clifton, 2012). In principle, 
longer rereading times can be a result of either more regressions or longer fixation times following 
(the equivalent amount of) regressions. Bicknell and Levy (2011) suggest that higher level linguistic 
processing causes the reader to go back and reread previous text in order to correct a failure in or 
decreased confidence in comprehension. Eye-tracking technology is an excellent tool to capture 
online comprehension monitoring in normal reading conditions without invoking any possibility 
about the presence of inconsistencies in the given text, which was the case in previous studies where 
participants were asked to identify inconsistencies (e.g., Baker, 1984; Cain et al., 2004; Kim, 2017). 
When using eye tracking technology, time spent rereading target words and surrounding regions 
would likely indicate the reader’s identification of semantic inconsistencies and possible repair 
attempts. Indeed, some previous studies focusing on sentence and discourse processing provide 
evidence that in skilled adult readers, higher order comprehension processes influence eye movements
primarily when problems such as syntactic ambiguity occur (e.g., Binder, Duffy, & Rayner,
2001; Clifton, Staub, & Rayner, 2007; Rayner, Garrod, & Perfetti, 1992), resulting in longer fixation 
durations, shorter saccades, and more regressions. 

Research with children and eye tracking has gained momentum over the last decade after some 
pioneering work in the 1990s (Blythe et al., 2006; Blythe & Joseph, 2011; Häikiö, Bertram, Hyönä, &
Niemi, 2009; Huestegge, Radach, Corbic, & Huestegge, 2009; Joseph, Nation, & Liversedge, 2013; 
Krieber et al., 2017; McConkie et al., 1991; Vorstius, Radach, Mayer, & Lonigan, 2013; Vorstius, 
Radach, & Lonigan, 2014; see also Was, Sansosti, & Morris, 2017). There is little dispute that, with 
development, readers show decreases in sentence reading times, fixation durations, number of 
fixations, refixations, and regressions, and increases in saccade amplitudes and word skipping
probability. However, studies that examine higher level processing and relations to off-line assessments
of reading skills are still scarce. A recent study by Krieber et al. (2017) is an exception and
showed relations between eye movements and off-line reading skills for 22 German adolescent 
readers. Another study examined fifth-grade students’ comprehension monitoring by focusing on 
the processing of conjunctive relations between clauses (Vorstius et al., 2013). In this study, students 
read sentences that contained either consistent or inconsistent information (e.g., Erica blushed 
because she was nervous. [consistent]; Erica blushed because she was confident. [inconsistent]).
Fifth graders were sensitive to inconsistency such that they spent substantially greater amounts of 
time in the target word region of inconsistency. Moreover, readers at all skill levels were able to 
identify the inconsistencies, as evidenced by prolonged first-pass reading times. However, children 
with better reading skill showed more rereading of critical sentence regions, indicating an association 
between reading skill and comprehension monitoring. Other work with children in upper elementary
grades (i.e., fifth and sixth grades) has provided similar results when the inconsistency occurred
in semantic relations across sentences (Connor et al., 2014) or in an extended narrative text (van der 
Schoot, Reijntjes, & Van Lieshout, 2012).


Present study 


Prior work has revealed that comprehension monitoring is an important skill that contributes to text 
comprehension (both listening and reading; Cain et al., 2004; Kim, 2015, 2016; Kim & Phillips, 2014; 
Strasser & Del Rio, 2014; Vosniadou et al., 1988). Building on these previous studies, we aimed to 
expand our understanding about comprehension monitoring and its unique role in reading comprehension,
using data at the beginning of second grade and eye tracking methodology. Three
specific primary research questions guided the present study. First, are beginning readers sensitive to 
inconsistencies in written texts? The vast majority of previous studies, particularly those using eye
tracking, included children in upper elementary grades (Grades 5 or 6) who had already achieved basic
reading skills. Consequently, very little is known about online comprehension monitoring
in beginning readers, whose reading is largely constrained by decoding skills. An
exception is a study with Finnish-speaking first graders which found that beginning readers in 
Finland do monitor their comprehension, and good comprehenders showed more consistent rereading
and lookbacks (Kinnunen, Vauras, & Niemi, 1998).

In the present study, we examined the amount of time spent on target implausible words (word 
N) and adjacent words (word N + 1 and word N + 2), as indicated by first-pass reading (decoding) 
and rereading (as part of comprehension monitoring). We also examined the extent to which 
children went back to the sentence that contained target words (sentence 4) and to the preceding 
sentences (sentences 1-3), after having read target implausible words (i.e., inconsistent sentences). 
These lookbacks (i.e., time spent rereading the sentence that contains target implausible words and 
the preceding sentences) are important indicators of efforts to reinspect, repair, and resolve inconsistency.
We anticipated substantially greater reading times on target implausible words and adjacent
words, as well as increased lookbacks, when readers need to determine semantic inconsistency. 

If beginning readers engage in, and individual differences are found in, online comprehension 
monitoring, an important corollary is what explains variation in this behavior. This was our second 
research question. In particular, we examined whether word reading and listening comprehension are 
related to comprehension monitoring. Word reading was expected to contribute to comprehension 
monitoring because comprehension monitoring was measured in the context of written texts (i.e., reading
context). For beginning readers, increased reading time and reinspection (or lookbacks) would reflect not 
only comprehension monitoring but also level of word reading proficiency to some extent. If this is the 
case, comprehension monitoring (as measured by increased rereading times on inconsistent words, 
lookbacks to disambiguating regions of text, and subsequent rereading) would be negatively related to 
word reading proficiency to the extent that it is influenced by word reading proficiency. Furthermore, 
listening comprehension might contribute to comprehension monitoring after parsing out the effect of
word reading. Mounting evidence indicates that listening comprehension, a discourse-level oral comprehension
skill, involves lower level oral language skills, such as vocabulary and grammatical knowledge, as
well as higher order processes such as monitoring and inference (Florit, Roch, & Levorato, 2014; Kim, 
2015, 2016, 2017; Kim & Phillips, 2014; Lepola, Lynch, Laakkonen, Silvén, & Niemi, 2012; Tompkins, 
Guo, & Justice, 2013). In other words, although decoding skill would place a constraint on beginning 
readers’ monitoring behavior as expressed in eye movements, individual differences in listening comprehension
(the ability to comprehend oral language at the discourse level) may also contribute to online
comprehension monitoring over and above word reading, even for beginning readers. 

The final question was whether online comprehension monitoring makes a unique, independent 
contribution to reading comprehension after accounting for the two powerful component skills of 
reading comprehension, namely word reading and listening comprehension, according to the simple 
view of reading (see Figure 2). By now, the essential roles of word reading and listening comprehension 
in reading comprehension are well established (Catts et al., 2006; Joshi et al., 2012; see Florit & Cain, 
2011, for a review). In fact, recent studies that measured word reading and listening comprehension as 
latent variables, with minimal measurement error, explained the vast majority of variance in reading 
comprehension (Adlof, Catts, & Little, 2006; Kim, 2017; Kim, Wagner, & Foster, 2011; Language and 
Reading Research Consortium [LARRC], 2015). Then, does comprehension monitoring, measured as 
online monitoring behavior using eye tracking methodology, contribute to reading comprehension over 
and above word reading and listening comprehension? To our knowledge, no previous studies have 
examined whether online comprehension monitoring behavior, or time spent looking at implausible 
target words and reinspecting the sentence that contains the implausible words, makes an additional 
contribution to reading comprehension over and above word reading and listening comprehension. 
Note that word reading in the simple view of reading is the ability to read or decode individual words 
out of context; thus, it was measured as such in the present study as well as in previous studies. 


Method 
Participants 


The present study draws on data from 319 second-grade children (50% girls; mean age = 7.33 years,
SD = .52) in six schools in the southeastern region of the United States. These children were 
participants in a larger longitudinal study about reading development from Grades 1 to 3 (see 
Kim & Petscher, 2016, for a study on word prosody using data in Grade 1), but data from the 
beginning of second grade are utilized because this is when comprehension monitoring using eye 
tracking technology was assessed. Data from four children were excluded from the analysis because 
they did not follow instructions/could not read, showed random viewing patterns, or could not 
answer simple comprehension questions (see next). Approximately 61% of the children were White, 
25% were African American, 6% were Hispanic, 6% were multiracial, and 3% were Asian American. 
Only two children were classified as having limited English proficiency. Approximately 53% of the 
children were eligible for free and reduced-price lunch. According to the school district record, 
approximately 14% of the children received speech services, and 2% of the children received services 
related to language impairment and learning disabilities. 


Measures 


Comprehension monitoring 

Materials. Children were presented with short stories consisting of four sentences, with each 
sentence presented as one line of text. Items were constructed so that the first sentence provided 
an opening statement with general background information. The second sentence presented a key 
statement with one critical concept specifying a specific semantic relation such as an object, time,
instrument, or attribute. The third sentence served as a filler, intended to increase working memory
load, which fit the context but had no direct relationship to the statements made before. Critically, 
the final sentence continued the action or situation described in sentence 2, such that the argument 
of the critical semantic relation was either consistent or inconsistent with the one used in sentence 2.
For example, when the initial statement (sentence 2) read, “He always wears his shorts to play 
outside,” the critical continuation (sentence 4) would be “This morning, Danny put on his shorts/ 
pants and went outside.” Note that the last sentence is correct by itself in both versions (shorts or
pants). However, to detect an inconsistency in the case of the “pants” version, the reader needs to be
sensitive to the fact that the most recent object was not consistent with the one seen before, 
indicating monitoring for coherence in the accumulating semantic representation on the story 
level (see Rinck, Gamez, Diaz, & De Vega, 2003, for a similar approach for adults; see Appendix 
A for sample consistent and inconsistent stories). 

Twenty-one stories were presented to children, including one practice story, 10 stories with 
consistent information, and 10 stories with inconsistent information. Two counterbalanced lists 
were created such that the consistent and inconsistent versions of each item were read by different 
participants. When creating the items, priority was given to variations in the semantic (in)consistency
over the lexical properties of the target word such as word length and frequency. This resulted
in slight differences in word properties for the target word N, as well as words N + 1 and N + 2. 
Mean word lengths and frequencies (SUBTLEX-US) for consistent versus inconsistent words are
presented in Table 1. Note that words in the inconsistent condition were, on average, shorter and 
had higher frequency values than words in the consistent condition. As shorter words and words 
with higher frequencies generally receive shorter viewing times (countering our hypothesized effect), 
any effect indicating longer times on the inconsistent words would make the case for an inconsistency
effect even stronger.
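
The counterbalancing described above can be sketched as follows. The source states only that two lists were created so that the consistent and inconsistent versions of each item were read by different participants; the odd/even alternation rule and the function name `build_counterbalanced_lists` are assumptions made for illustration.

```python
def build_counterbalanced_lists(n_items=20):
    """Return two dicts mapping item number -> condition.

    Assumed scheme (not stated in the source): odd-numbered items are
    consistent on list A and inconsistent on list B; even-numbered
    items are the reverse. Each list then holds 10 stories per condition.
    """
    list_a, list_b = {}, {}
    for item in range(1, n_items + 1):
        if item % 2 == 1:
            list_a[item], list_b[item] = "consistent", "inconsistent"
        else:
            list_a[item], list_b[item] = "inconsistent", "consistent"
    return list_a, list_b


list_a, list_b = build_counterbalanced_lists()
```

Under this scheme every child reads 10 consistent and 10 inconsistent stories, and no child sees both versions of the same item.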

To promote reading for meaning, children were asked five simple recall questions after randomly 
selected stories (e.g., What is the boy’s name?). Questions were never related to the inconsistencies, 
and responses to these questions were not included in the analysis because they were not intended to 
capture individual differences in deep processing of meaning but rather to ensure reading for 
meaning. 


Apparatus and procedure. Texts were presented on a 21-in. monitor with a screen resolution of 
1024 x 768 pixels. Courier New font in 20-point size was used, and viewing distance was adjusted so 
that one letter corresponded to .33 degree of visual angle. Texts were presented in black color on a 
gray background with double line spacing. Each short story for the inconsistency detection task was 
presented on a separate screen. Children were instructed to “read the text silently, so that you 
understand the content and are able to answer comprehension questions.” Before the child read a 
story, the recording system was calibrated to ensure optimal measurement accuracy using a 9-point 
calibration and validation routine. Immediately prior to text presentation, an additional drift 
correction check was performed to ensure identical start positions for the eyes during the onset of 
each story. If deviations larger than .5 degree of visual angle were detected, the camera system was
recalibrated. Eye movements were tracked with an EyeLink 1000 desktop-mounted system with a
500 Hz sampling rate. Viewing and recording were binocular, but only data from the right eye
were included in data analysis. Eye-movement data were processed using custom-built EyeMap
software (Tang, Reilly, & Vorstius, 2012) and SPSS.


Table 1. Descriptive Word Properties: Mean Word Length (Letters) and Word Frequency (SUBTLEX-US Lg10WF) for Consistent and
Inconsistent Sentences


                                Word Length       Word Frequency
Region         Condition        M       SD        M       SD
Word N         Consistent       6.0     0.9       2.6     0.8
               Inconsistent     5       1.4       3.1     0.9
Word N + 1     Consistent       4.5     2.4       4.9     0.8
               Inconsistent     3.4     1.7       4.9     1.1
Word N + 2     Consistent       4.1     1.2       4.7     0.9
               Inconsistent     3.0     0.9       5.6     0.9


Eye movement variables. For the current study we focused on two temporal eye-movement 
parameters: first-pass reading time (gaze duration) and rereading time on three words in sentence 
4, namely, the target word (N) as well as the word immediately following the target word (N + 1) 
and the second next word (N + 2). We examined words N + 1 and N + 2 separately in addition 
to word N in order to capture possible spillover effects from the target word (Kliegl, Nuthmann, 
& Engbert, 2006). First-pass reading time was calculated by summing up all durations of fixations 
on a respective word before leaving the word for the first time. Rereading time for each word was 
calculated by summing up all durations of fixations on the respective word after first-pass 
reading; therefore, words that were fixated on in first-pass reading but were not reread were 
assigned a rereading time of zero. As laid out earlier, first-pass reading time and rereading time 
are indicators of different ongoing processes, namely, decoding (first-pass reading time) and later 
integration processes, which are both relevant to comprehension monitoring (see Inhoff & 
Radach, 1998). 
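
As a minimal sketch of the two word-based measures defined above, the function below computes first-pass reading time and rereading time for one word in one trial. The input layout (a time-ordered list of `(word_index, duration_ms)` pairs) and the function name are assumptions for illustration; they are not the EyeMap output format, and no check for skipped words is applied.

```python
def first_pass_and_rereading(fixations, word):
    """Return (first_pass_ms, rereading_ms) for `word`.

    fixations: list of (word_index, duration_ms) in temporal order
    (hypothetical layout). First-pass time sums fixations on the word
    until the eyes leave it for the first time; rereading time sums
    every later fixation on the same word.
    """
    first_pass = 0
    rereading = 0
    state = "before"  # "before" -> "first_pass" -> "after"
    for w, duration in fixations:
        if w == word:
            if state == "before":
                state = "first_pass"
            if state == "first_pass":
                first_pass += duration
            else:
                rereading += duration
        elif state == "first_pass":
            # The eyes left the word: first-pass reading is over.
            state = "after"
    return first_pass, rereading


# Made-up trial: word 4 is fixated twice, left, then revisited twice.
trial = [(3, 200), (4, 250), (4, 150), (5, 180), (4, 300), (2, 220), (4, 100)]
print(first_pass_and_rereading(trial, 4))  # (400, 400)
print(first_pass_and_rereading(trial, 5))  # (180, 0)
```

A word that is read once and never revisited receives a rereading time of zero, matching the treatment described in the text.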

In addition to these word-based measures, we included sentence-level parameters to capture 
children’s extended monitoring behavior after encountering the inconsistency by calculating the time 
spent on sentence 4 (in which the inconsistent word occurred) after first encountering the inconsistency.
The same measure was also calculated for sentences 1-3, combining fixations on these
preceding lines of text, as not all children looked back to all preceding sentences (see next). As 
accumulated viewing times on these sentences after first encountering the inconsistency reflect 
efforts to confirm and/or solve the inconsistency, we included 0s in the calculation of means for
these parameters. Hence, along with cases where actual rereading took place, cases with no rereading 
were represented in the means of these sentence-level parameters. 
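
The sentence-level measures can be sketched in the same spirit. The tuple layout `(sentence, word_id, duration_ms)`, the function name, and the example values are hypothetical. Following the note to Table 3, viewing times on words N, N + 1, and N + 2 are left out of the sentence 4 total, and trials without any lookback contribute a zero to the mean.

```python
from statistics import mean


def lookback_times(fixations, target, region):
    """Return (sentence4_ms, sentences1_3_ms) after the target is first fixated.

    fixations: list of (sentence, word_id, duration_ms) in temporal order.
    target: word_id of word N in sentence 4.
    region: set of word_ids for words N, N + 1, N + 2 in sentence 4,
            which are excluded from the sentence 4 total.
    """
    sentence4 = 0
    sentences1_3 = 0
    seen_target = False
    for sentence, word, duration in fixations:
        if not seen_target:
            if sentence == 4 and word == target:
                seen_target = True
            continue
        if sentence == 4 and word not in region:
            sentence4 += duration
        elif sentence in (1, 2, 3):
            sentences1_3 += duration
    return sentence4, sentences1_3


# Two made-up trials: one with a lookback to sentence 2, one without.
with_lookback = [(3, 1, 200), (4, 1, 210), (4, 5, 300), (4, 6, 200),
                 (4, 8, 250), (2, 3, 400), (4, 5, 150), (4, 2, 180)]
no_lookback = [(4, 5, 300), (4, 6, 200)]
results = [lookback_times(t, 5, {5, 6, 7}) for t in (with_lookback, no_lookback)]
# Zeros stay in the mean, so the no-lookback trial pulls it down.
mean_sentences1_3 = mean(r[1] for r in results)
```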


Word reading. The following three tasks were used to measure word reading: the Letter Word 
Identification subtask of the Woodcock-Johnson III (WJ; Woodcock, McGrew, & Mather, 2001), the 
Word Reading subtask of the Wechsler Individual Achievement Test-III (WIAT; Wechsler, 2009), 
and the Sight Word Efficiency subtask of the Test of Word Reading Efficiency-II (TOWRE; Wagner, 
Torgesen, & Rashotte, 2012). In these tasks, the child was asked to read aloud (isolated) words of 
increasing difficulty. The first two tasks were untimed, whereas TOWRE was a timed task (45 s). 
Cronbach’s alpha estimates were .95 and .91 for the WJ Letter Word Identification and WIAT Word 
Reading tasks, respectively. Test-retest reliability for the TOWRE task was reported as .93 for 6- to 
7-year-olds (Wagner et al., 2012). 


Listening comprehension. Two tasks were used: the Listening Comprehension Scale of the Oral and 
Written Language Scales-II (OWLS; Carrow-Woolfolk, 2011) and the Oral Comprehension subtest 
of the WJ (Woodcock et al., 2001). In the OWLS Listening Comprehension task, the child was asked 
to point to the picture that best describes the heard sentences and connected texts (e.g., short 
stories). In the WJ Oral Comprehension subtest, the child was asked to complete the heard sentences 
(e.g., People sit in ) and short paragraphs. Cronbach’s alpha estimates were .93 and .74 for the 
OWLS Listening Comprehension and WJ Oral Comprehension tasks, respectively. 


Reading comprehension. The following two tasks were used: the Passage Comprehension subtest of 
the WJ and Reading Comprehension subtest of the WIAT. The Passage Comprehension subtest is a 
cloze task where the child was asked to read sentences and short passages and fill in blanks. In the 
Reading Comprehension subtest, the child was asked to read passages and answer multiple-choice 
questions. Cronbach’s alpha estimates were .86 and .87 for the WJ Passage Comprehension and 
WIAT Reading Comprehension tasks, respectively. 
Procedures


Language and reading assessment 

Children were individually assessed in quiet spaces in several sessions in the fall. Assessors were 
rigorously trained and had to meet 99% reliability in a fidelity check before they were allowed to 
work with children. 


Data analysis 


To address the question about whether beginning readers detect inconsistencies, we compared 
consistent and inconsistent stories with respect to reading times separately for three words of 
interest in sentence 4, namely, the target word (N), word N + 1, and word N + 2. In addition, we 
compared sentence reading times (for sentence 4 and sentences 1-3) after encountering the focal 
words of interest. Repeated measures analyses of variance with the factor, consistency, were used for 
statistical testing.¹
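
With a single within-subject factor that has only two levels, the repeated measures F statistic equals the square of the paired-samples t statistic, with degrees of freedom (1, n - 1). The sketch below illustrates this on made-up per-child mean reading times; the function name and data are hypothetical, and the analysis in the study itself was run in SPSS.

```python
from math import sqrt
from statistics import mean, stdev


def rm_anova_two_levels(consistent, inconsistent):
    """Return (F, df1, df2) for a two-level within-subject factor.

    Each list holds one mean reading time per child, in the same
    child order. F is computed as the squared paired-samples t.
    """
    diffs = [b - a for a, b in zip(consistent, inconsistent)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t * t, 1, n - 1


# Made-up means (ms) for four children.
f_value, df1, df2 = rm_anova_two_levels([500, 450, 600, 520],
                                        [600, 500, 650, 640])
```

The denominator degrees of freedom of 299 reported in the Results are consistent with 300 children remaining after the exclusions described in the next paragraph.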

Raw data from the eye tracker were transformed into csv-files using EyeMap (Tang et al., 2012). 
These files were then read into SPSS and cleaned. From a total of 353,416 fixations from 319 
participants, 14,765 fixations were located on words N, N + 1, and N + 2 combined. Fixations 
shorter than 80 ms (2.3%) or longer than 2,000 ms (0.1%) were excluded from further analyses. 
Participants that contributed fewer than 20 trials (or items) were also eliminated from further 
analyses (19 participants, with a total of 237 fixations on target words; 1.7%), resulting in a final 
set of 14,528 fixations for word-based analyses and 339,686 overall fixations. 
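
The exclusion rules above can be expressed as a short filter. This is a sketch under stated assumptions: the record layout (dicts with `participant`, `trial`, and `duration` keys) and the function name are hypothetical, and the source does not say whether trials were counted before or after the duration filter (here, after).

```python
from collections import defaultdict


def clean_fixations(fixations, min_ms=80, max_ms=2000, min_trials=20):
    """Apply the duration and trial-count exclusions.

    1. Drop fixations shorter than min_ms or longer than max_ms.
    2. Drop participants who contribute fewer than min_trials trials
       (counted on the remaining fixations; an assumption).
    """
    kept = [f for f in fixations if min_ms <= f["duration"] <= max_ms]
    trials = defaultdict(set)
    for f in kept:
        trials[f["participant"]].add(f["trial"])
    valid = {p for p, seen in trials.items() if len(seen) >= min_trials}
    return [f for f in kept if f["participant"] in valid]


# Made-up data: p1 has 20 usable trials plus two out-of-range fixations;
# p2 has only 5 trials and is removed entirely.
data = [{"participant": "p1", "trial": t, "duration": 200} for t in range(20)]
data += [{"participant": "p1", "trial": 0, "duration": 50},
         {"participant": "p1", "trial": 1, "duration": 2500}]
data += [{"participant": "p2", "trial": t, "duration": 300} for t in range(5)]
cleaned = clean_fixations(data)
```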

Before fitting structural equation models (SEMs) for the second and third research questions, eye- 
movement variables were log transformed due to slight skewness (see Table 3; also see Appendix B 
for an example²). Furthermore, measurement models were fitted to create the following latent
variables: comprehension monitoring, word reading, listening comprehension, and reading 
comprehension. 
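
The text does not specify how zero rereading times were handled in the log transformation. One common choice that keeps zeros at zero is the natural log of (x + 1), sketched below purely as an assumption; the function name is hypothetical.

```python
from math import log1p


def log_transform(time_ms):
    """Natural log of (time_ms + 1); a zero reading time stays at 0.

    Assumption: the source only says the variables were log
    transformed and that zeros were retained as true values.
    """
    return log1p(time_ms)


# A zero rereading time maps to 0; 3,738 ms maps to about 8.23.
values = [round(log_transform(ms), 2) for ms in (0, 566, 3738)]
```

Under this assumption the transformed range for rereading on word N (0 to about 8.23) agrees with the values listed in Table 3, although leaving zeros untouched and taking the plain natural log of positive values would give the same endpoints after rounding.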

To address the second research question, word reading and listening comprehension were 
hypothesized to predict comprehension monitoring (Figure 1). To address the third research 
question, two models were fitted. In the first model, word reading and listening comprehension 
were included as predictors of reading comprehension to examine the simple view of reading. In the 
second model, comprehension monitoring was added to examine its unique contribution over and 
above word reading and listening comprehension. Following conventions in the field, model fit was 
evaluated by widely used multiple indices, including the comparative fit index (CFI; > .90 as 
acceptable), Tucker-Lewis index (TLI; > .90 as acceptable), and root mean square error of approximation
(RMSEA) and standardized root mean square residual (SRMR; < .10 as acceptable; Hooper,
Coughlan, & Mullen, 2008; Hu & Bentler, 1999; Kline, 2013). 
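
The fit criteria listed above amount to a simple checklist. The sketch below reads the "< .10" cutoff as applying to both RMSEA and SRMR, which is an interpretation of the sentence rather than something stated separately; the function name and the fit values in the example are hypothetical.

```python
def acceptable_fit(cfi, tli, rmsea, srmr):
    """Check each fit index against the cutoffs named in the text."""
    return {
        "CFI": cfi > 0.90,
        "TLI": tli > 0.90,
        "RMSEA": rmsea < 0.10,  # assumed to share the SRMR cutoff
        "SRMR": srmr < 0.10,
    }


# Hypothetical fit values for illustration only.
checks = acceptable_fit(cfi=0.97, tli=0.95, rmsea=0.05, srmr=0.04)
model_is_acceptable = all(checks.values())
```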


Results 
Descriptive statistics and preliminary results 


Table 2 presents descriptive statistics for listening comprehension, word reading, and reading 
comprehension measures. Mean standard scores of the language and reading skills indicate that 
children’s mean performance is in the average range on all the measures compared to that of a norm 
group. For instance, the mean standard score for the OWLS Listening Comprehension was 106.60 
(SD = 13.12) and for the WIAT Reading Comprehension was 100.36 (SD = 14.18). 


¹Note that using linear mixed effects models yielded identical results.

²Log transformed values are not particularly different from the original raw values for rereading time for target implausible word
N, word N + 1, and word N + 2 (see Table 3). This is primarily due to the zero values as shown in Appendix B. Given that the 0s
were true values, the results reported in the text are from data including 0s.
Figure 1. Standardized path regression weights of the model in which comprehension monitoring is predicted by word reading
and listening comprehension. Note. Dark lines represent statistically significant relations. Gray lines represent covariances.
WJ = Woodcock-Johnson III; LWID = Letter Word Identification; WIAT = Wechsler Individual Achievement Test-III;
TOWRE = Test of Word Reading Efficiency-II; OC = Oral Comprehension; OWLS = Oral and Written Language Scales-II;
Target = target inconsistent words.


Table 2. Descriptive Statistics for Listening Comprehension, Word Reading, and Reading Comprehension


Variable                              M        SD      Minimum   Maximum   Skewness   Kurtosis
Listening Comprehension
  WJ Oral Comprehension               16.31    3.57    6.00      26.00     -0.03      -0.12
  WJ Oral Comprehension SS            106.00   12.71   71.00     140.00    -0.16      -0.21
  OWLS Listening Comprehension        83.75    11.68   46.00     111.00    -0.35      0.38
  OWLS Listening Comprehension SS     106.60   13.12   65.00     135.00    -0.47      0.35
Word Reading
  WJ Letter Word Identification       41.94    6.62    24.00     62.00     0.20       0.44
  WJ Letter Word Identification SS    108.58   12.08   73.00     135.00    -0.61      0.29
  WIAT Word Reading                   28.10    10.51   2.00      56.00     -0.04      -0.04
  WIAT Word Reading SS                103.36   15.26   63.00     143.00    -0.37      -0.23
  TOWRE Sight Word                    51.82    13.52   4.00      80.00     -0.55      -0.20
  TOWRE Sight Word SS                 103.96   16.57   55.00     137.00    -0.59      -0.26
Reading Comprehension
  WJ Passage Comprehension            22.76    4.79    12.00     35.00     -0.03      -0.62
  WJ Passage Comprehension SS         100.18   11.94   60.00     126.00    -0.53      0.22
  WIAT Reading Comprehension          25.58    8.63    0.00      40.00     -0.57      -0.39
  WIAT Reading Comprehension SS       100.36   14.18   46.00     149.00    -0.07      0.91


Note. WJ = Woodcock-Johnson III; SS = standard score; OWLS = Oral and Written Language Scales-II; WIAT = Wechsler Individual
Achievement Test-III; TOWRE = Test of Word Reading Efficiency-II.


Research Question 1: Are beginning readers sensitive to inconsistencies in written texts? 


For the analysis of sensitivity to inconsistencies, descriptive statistics for first-pass reading times and 
rereading times (Table 3) for different regions are presented. For first-pass reading times (i.e., gaze 
duration), we found an inconsistency effect, with longer reading times (+ 99 ms) on the inconsistent 
compared to consistent target word (word N), F(1, 299) = 42.21, p < .001. On the word immediately 
following the target word (word N + 1), the difference in first-pass reading times between consistent 
and inconsistent conditions was 15 ms and was not statistically significant, F(1, 299) = 1.93, p = .17. 
On the second word following the target word (word N + 2), we found a reversed inconsistency 
effect for first-pass reading (- 70 ms), F(1, 299) = 44.57, p < .001, which we attribute to the word 
properties (see Table 1), as words in the inconsistent condition were shorter and more frequent 
compared to those in the consistent condition. 
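
The F statistics above can be related to paired t values and a standardized effect size. The sketch below is illustrative only. It assumes that each F(1, 299) comes from a two-condition within-subject comparison, so that F equals the squared paired t, and that n = 300 (error df plus one). The function names are hypothetical and are not part of the original analysis.

```python
import math

# F statistics for the first-pass (gaze duration) contrasts, as reported in the text.
# Assumption: each F(1, 299) is a two-condition within-subject contrast.
reported_f = {"word N": 42.21, "word N + 1": 1.93, "word N + 2": 44.57}
n_children = 300  # assumed: error df of 299 plus 1


def paired_t_from_f(f_value: float) -> float:
    """With one numerator df, F equals the square of the paired t statistic."""
    return math.sqrt(f_value)


def cohens_dz(t_value: float, n: int) -> float:
    """Standardized mean difference for paired data: dz = t / sqrt(n)."""
    return t_value / math.sqrt(n)


for region, f_value in reported_f.items():
    t_value = paired_t_from_f(f_value)
    dz = cohens_dz(t_value, n_children)
    # Absolute values only: F carries no information about the direction of the effect.
    print(f"{region}: |t| = {t_value:.2f}, |dz| = {dz:.2f}")
```

Because F is unsigned, the direction of each effect still has to be read from the condition means in Table 3.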

With respect to rereading times, target words received significantly longer rereading fixations (+ 
251 ms) in the inconsistent compared to consistent condition, F(1, 299) = 106.81, p < .001. The same 
pattern was found for word N + 1 (+ 92 ms), F(1, 299) = 28.91, p < .001, and word N + 2 (+ 55 ms), 
F(1, 299) = 13.78, p < .001. It is important to note here that these elevated rereading times occurred 
despite the fact that words in the inconsistent condition were shorter and had higher frequency and, 
thus, were expected to be easier to process compared to the consistent counterparts. 


Table 3. Eye Movement Measures: Mean First-Pass Times and Rereading Times (ms) for Key Regions in Consistent and Inconsistent 
Target Sentences 

Region          Condition              M      SD   Minimum   Maximum   Skewness   Kurtosis

First-pass times (gaze duration)
Word N          Consistent           508     218       181      1522       1.51       3.12
                Inconsistent         607     290       223      2233       1.73       4.78
                Inconsistent log    6.31     .43      5.41      7.71        .33       -.14
Word N + 1      Consistent           471     184       163      1327       1.24       2.05
                Inconsistent         486     211       197      1908       1.92       7.43
                Inconsistent log    6.11     .39      5.28      7.55        .32        .07
Word N + 2      Consistent           499     175       166      1281       1.11       1.63
                Inconsistent         429     163       144      1309       1.35       3.09
                Inconsistent log    5.99     .36      4.97      7.18        .20       -.05

Rereading times
Word N          Consistent           315     292         0      1780       2.10       5.37
                Inconsistent         566     479         0      3738       2.46      10.87
                Inconsistent log    5.93    1.18         0      8.23      -2.56      10.63
Word N + 1      Consistent           259     242         0      2014       2.41      10.62
                Inconsistent         351     328         0      3065       3.20      19.24
                Inconsistent log    5.36    1.34         0      8.03      -2.15       6.35
Word N + 2      Consistent           269     254         0      1532       2.33       7.53
                Inconsistent         324     248         0      1612       1.70       4.82
                Inconsistent log    5.29    1.47         0      7.39      -2.53       6.75
Sentence 4      Consistent          1516     774       377      7174       2.99      15.56
                Inconsistent        1811     854       341      6399       1.60       4.46
                Inconsistent log    7.40     .46      5.83      8.76       -.31        .92
Sentences 1-3   Consistent          1591    3000         0     31612       4.92      37.02
                Inconsistent        1741    3051         0     29728       4.61      31.66
                Inconsistent log    5.60    2.92         0     10.30      -1.02       -.20

Note. Total reading times for sentences include only fixation durations that occurred after the target word was fixated on and do 
not include viewing times on words N, N + 1, and N + 2. log = log transformation. 
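
The "log" rows in Table 3 do not state the base of the transformation. The following sketch checks whether the reported log minima and maxima agree with the natural logarithm of the corresponding raw values; the helper name and the tolerance are assumptions made for this illustration. How zero rereading times were handled (the log rows list a minimum of 0) cannot be determined from the table and is not modeled here.

```python
import math

# (raw value in ms, reported log value) pairs taken from Table 3, inconsistent condition.
table3_pairs = {
    "first-pass word N + 1, minimum": (197, 5.28),
    "first-pass word N + 1, maximum": (1908, 7.55),
    "first-pass word N + 2, minimum": (144, 4.97),
    "first-pass word N + 2, maximum": (1309, 7.18),
    "rereading sentence 4, minimum": (341, 5.83),
    "rereading sentence 4, maximum": (6399, 8.76),
}


def matches_natural_log(raw_ms: float, reported: float, tol: float = 0.0051) -> bool:
    """True if ln(raw_ms) agrees with the reported value up to two-decimal rounding."""
    return abs(math.log(raw_ms) - reported) <= tol


def matches_log10(raw_ms: float, reported: float, tol: float = 0.0051) -> bool:
    """Same check for a base-10 logarithm, included for comparison."""
    return abs(math.log10(raw_ms) - reported) <= tol


for label, (raw_ms, reported) in table3_pairs.items():
    print(f"{label}: natural log {matches_natural_log(raw_ms, reported)}, "
          f"log10 {matches_log10(raw_ms, reported)}")
```

Under these assumptions, the nonzero extremes are consistent with a natural log transformation rather than a base-10 one.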

Regarding rereading of sentences after encountering the target word, we found that rereading 
times for sentence 4 were significantly higher (+ 295 ms) in inconsistent compared to consistent 
stories, F(1, 299) = 67.18, p < .001. No significant difference was found for rereading times on 
sentences 1-3 after encountering the target word, F(1, 299) = 2.09, p = .15. 
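
The condition differences quoted in this section follow directly from the means in Table 3. As a simple arithmetic check, the sketch below recomputes each inconsistency effect (inconsistent minus consistent mean, in ms); the dictionary layout and variable names are hypothetical.

```python
# Condition means (ms) from Table 3, stored as (consistent, inconsistent).
table3_means = {
    ("first-pass", "word N"): (508, 607),
    ("first-pass", "word N + 1"): (471, 486),
    ("first-pass", "word N + 2"): (499, 429),
    ("rereading", "word N"): (315, 566),
    ("rereading", "word N + 1"): (259, 351),
    ("rereading", "word N + 2"): (269, 324),
    ("rereading", "sentence 4"): (1516, 1811),
    ("rereading", "sentences 1-3"): (1591, 1741),
}

# Inconsistency effect = inconsistent mean minus consistent mean.
inconsistency_effect = {
    key: inconsistent - consistent
    for key, (consistent, inconsistent) in table3_means.items()
}

for (measure, region), effect in inconsistency_effect.items():
    print(f"{measure:10s} {region:14s} {effect:+d} ms")
```

A negative value corresponds to the reversed effect described for first-pass reading on word N + 2.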

Note that there was a slight floor effect in the rereading time on sentences 1-3 such that, out of 
300 children, 61 (20% of the sample) did not spend any time at all on the first three sentences (i.e., 
no reinspections of sentences 1-3). 

Additional analyses showed that rereading behavior did not change as a function of time on task. 
This was an important manipulation check, as item construction was very similar across the task. 
However, this similarity did not seem to influence children’s rereading behavior, at least not across 
the 20 trials (or items) used in this study. To further examine how children allocated their time when 
rereading, we performed scan path analyses for all fixations after encountering the target word 
region. Visual comparison of the results revealed that children in second grade did not yet show a 
stable, clear scanning pattern. Instead, we found full rereading, partial rereading, backward scanning, 
and sometimes even random scanning across and within children. 

Overall, the data indicate that second graders at the beginning of the academic year are able to 
detect inconsistencies in short stories, although their scanning pattern is not clearly established 
yet, and that their efforts to resolve such inconsistencies are mostly restricted to the text region 
where the inconsistency was elicited. In this region, additional processing indicative of 
comprehension monitoring continues while children fixate on the following word, creating a substantial 
total effect. 


Research Question 2: Are word reading and listening comprehension related to 
comprehension monitoring? 


Prior to fitting SEMs, latent variables were created for the following constructs: word reading, 
listening comprehension, reading comprehension, and comprehension monitoring using variables 
described above. Loadings were all adequate, ranging from .62 to .96 (ps < .001; see Figures 1 and 2). 
Comprehension monitoring was indicated by rereading time for target word N, target word N + 1, 
target word N + 2, and sentence 4 in inconsistent stories. The loading for the rereading time on 
sentences 1-3 was low (.15, p = .01), most likely due to the floor effect; therefore, it was not included 
as an indicator of comprehension monitoring in subsequent SEMs (the results were essentially 
identical when it was retained). Finally, confirmatory factor analysis was conducted to examine 
whether gaze duration (which is hypothesized to primarily tap into decoding processes) and 
rereading time for target word N, target word N + 1, target word N + 2, and sentence 4 (which 
are hypothesized to tap into comprehension monitoring) are best described as two dissociable latent 
variables or a single latent variable. Results showed that the two-factor model was superior 
(Δχ² = 119.81, Δdf = 1, p < .001), and, therefore, in the subsequent structural models, rereading 
times were used as indicators of comprehension monitoring. 
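
The model comparison above is a chi-square difference test with one degree of freedom. As a minimal sketch, the upper-tail probability for the reported difference can be obtained with the standard library alone, using the identity P(X > x) = erfc(sqrt(x / 2)) for a chi-square variate with 1 df. The function name is hypothetical, and the sketch assumes an ordinary (unscaled) chi-square difference.

```python
import math


def chi2_upper_tail_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    A chi-square variate with 1 df is a squared standard normal, so
    P(X > x) = 2 * (1 - Phi(sqrt(x))) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


delta_chi2 = 119.81  # reported difference between the one- and two-factor models
p_value = chi2_upper_tail_df1(delta_chi2)
print(f"p = {p_value:.3g}")
```

The resulting probability is far below the .001 threshold reported in the text.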

Bivariate correlations between rereading time indicators, listening comprehension, word reading, 
and reading comprehension are reported in Table 4. Rereading time indicators were moderately 
related to one another (.41 < rs < .50). Relations between listening comprehension, word reading, 
and reading comprehension ranged from moderate to strong (.35 < rs < .91). Relations of rereading 

Figure 2. Standardized path regression weights of the model in which reading comprehension is predicted by word reading, 
listening comprehension, and comprehension monitoring. Note. Solid lines represent statistically significant relations. Dashed lines 
represent statistically non-significant relations. Gray lines represent covariances. WJ = Woodcock-Johnson III; LWID = Letter Word 
Identification; WIAT = Wechsler Individual Achievement Test-III; TOWRE = Test of Word Reading Efficiency-II; OC = Oral 
Comprehension; OWLS = Oral and Written Language Scales-II; Target = target inconsistent words; C = Comprehension. 

Table 4. Bivariate Correlations Between Rereading Times (Transformed Values), Listening Comprehension, Word Reading, and 
Reading Comprehension 


1 2 3 4 5 6 7 8 9 10 
1. Reread duration word N 1 
2. Reread duration word N + 1 50*** 1 
3. Reread duration word N + 2 A4te* Alt 1 
4. Reread of sentence 4 after reading AB*** AGH A5*** 1 
target region 
5. WJ Oral Comprehension 0.05 0.07 —0.02 —.15** 1 
6. OWLS Listening Comprehension 0.11 0.10 0.07 —0.06 64*** 1 
7. WJ Letter Word Identification 0.03 0.07 0.04 0***  A4AHee  AQ*HE 1 
8. WIAT Word Reading 0.00 0.07 0.04 ORE ATER AGREE 9] HH 1 
9. TOWRE Sight Word Efficiency 0.10 .19** .12* AAERE 35*** 388" (80*** .81*** 1 
10. WJ Passage Comprehension 0.04 0.00 —0.01 —.20*** = 5QO***  63** 70K 72K 66 
11. WIAT Reading Comprehension 0.06 —0.01 0.01 =23"** (S6*** SE * BOF 798F* 72*8* 77 * 


Note. WJ = Woodcock-Johnson III; OWLS = Oral and Written Language Scales-II; WIAT = Wechsler Individual Achievement Test-III; 
TOWRE = Test of Word Reading Efficiency-II. 
*p < .05. **p < .01. ***p < .001. 


times with listening comprehension, word reading, and reading comprehension ranged from null to 
moderate (.00 < rs < .44). 

The SEM model had an adequate fit to the data, χ²(24) = 106.62, p < .01, CFI = .95, TLI = .92, 
RMSEA = .10, SRMR = .08; results are shown in Figure 1. Word reading was moderately and 
negatively related to online comprehension monitoring (β = -.31, p < .001), suggesting that children 
with lower word reading proficiency spent greater time rereading target words and the sentence 
containing target words (i.e., sentence 4). Once word reading was held constant, listening 
comprehension was positively related to online comprehension monitoring (β = .22, p = .01). 
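
For readers less familiar with fit indices, the RMSEA can be approximated from the model chi-square, its degrees of freedom, and the sample size as sqrt(max(χ² - df, 0) / (df (N - 1))). The sketch below applies this common formula to the values reported above. The sample size entering the SEM is not stated in this section, so N = 319 (the number of children given in the abstract) is an assumption, as are the function and variable names; some software uses N rather than N - 1 in the denominator.

```python
import math


def rmsea(chi2: float, df: int, n: int) -> float:
    """Point estimate of RMSEA: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


n_assumed = 319  # assumption: full sample reported in the abstract

# Chi-square and df of the model relating word reading and listening
# comprehension to comprehension monitoring.
rmsea_monitoring_model = rmsea(106.62, 24, n_assumed)
print(f"RMSEA = {rmsea_monitoring_model:.3f}")
```

With this assumed N, the estimate rounds to the reported value of .10; with N = 300 it would be about .107, so the result is sensitive to which sample size is used.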


Research Question 3: Is comprehension monitoring related to reading comprehension after 
accounting for word reading and listening comprehension? 


When reading comprehension was predicted by word reading and listening comprehension (i.e., simple 
view of reading), model fit was excellent, χ²(11) = 31.94, p < .001, CFI = .99, TLI = .98, RMSEA = .08, 
SRMR = .02. Word reading (β = .64, p < .001) and listening comprehension (β = .46, p < .001) were both 
independently related to reading comprehension and explained a total of 96% of variance in reading 
comprehension. When comprehension monitoring was included as an additional predictor, the model 
fit was good, χ²(38) = 137.35, p < .01, CFI = .96, TLI = .94, RMSEA = .09, SRMR = .08. However, after 
accounting for word reading and listening comprehension, online comprehension monitoring was not 
related to reading comprehension³ (β = .04, p = .29; see Figure 2). 
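
The 96% of explained variance exceeds the sum of the squared standardized coefficients (.64² + .46² ≈ .62), which is expected when the two predictors are correlated. With two standardized predictors, R² = β₁² + β₂² + 2β₁β₂r, where r is the correlation between the predictors. The sketch below solves this identity for r using the rounded estimates reported above; it is an illustrative back-calculation with hypothetical names, not a value taken from the original model output.

```python
def implied_predictor_correlation(beta_1: float, beta_2: float, r_squared: float) -> float:
    """Solve R^2 = b1^2 + b2^2 + 2*b1*b2*r for r (two standardized predictors)."""
    return (r_squared - beta_1 ** 2 - beta_2 ** 2) / (2.0 * beta_1 * beta_2)


beta_word_reading = 0.64   # reported standardized coefficient
beta_listening = 0.46      # reported standardized coefficient
r_squared_reported = 0.96  # reported proportion of variance explained

implied_r = implied_predictor_correlation(
    beta_word_reading, beta_listening, r_squared_reported
)
print(f"implied correlation between the predictors = {implied_r:.2f}")
```

Because the inputs are rounded to two decimals, the implied correlation should be read as approximate.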


Discussion 


Our goal was to investigate the nature and variability of online comprehension monitoring, its 
predictors, and its relation to reading comprehension for beginning readers. We found that second 
graders at the beginning of the school year were sensitive to inconsistencies in short stories such that 
they spent greater time rereading implausible target words than control words. These results are in 
line with previous studies with older children (e.g., Connor et al., 2014; van der Schoot et al., 2012) 
and highlight that such monitoring processes are in place early in reading development, which is 
consistent with an earlier study of online comprehension monitoring in Finnish first graders 
(Kinnunen et al., 1998). Furthermore, children, on average, spent more time looking back at the 
sentence that contained an implausible target word (sentence 4) than the prior sentences (sentences 
1-3). Lookbacks to preceding sentences were not found for quite a few children (20%), and, for 
many others, occurred only a couple of times, on average. In addition, lookbacks to preceding 
sentences had a weak relation to comprehension monitoring (i.e., weak loading), and post hoc 
analysis revealed that lookbacks to preceding sentences were not related to word reading or reading 
comprehension in bivariate correlations (.00 < rs < .04). These results suggest that at this early stage 
of reading development, lookbacks to previous text may be an indicator of a more deliberate strategy 
to resolve inconsistencies over and above the mechanism of detecting them. 

³ Results were essentially the same when the gaze duration latent variable was included as an additional predictor: neither gaze 
duration nor comprehension monitoring (rereading times) was statistically significant (ps > .41). 

In the structural analyses, as hypothesized, word reading proficiency was negatively related to 
comprehension monitoring such that lower word reading skill was associated with more time spent 
rereading text. This indicates that time spent rereading and looking back captures not only 
comprehension-related processes but also, to some extent, decoding-related processes, at least for 
beginning readers (i.e., second graders). Importantly, though, when this constraining role of word 
reading was accounted for, listening comprehension was positively related to comprehension 
monitoring (Figure 1). This result is in line with a previous study, in which an oral language latent 
variable composed of vocabulary and retell was related to comprehension monitoring in fifth graders 
(Connor et al., 2014), and indicates that meaning construction and integration processes in oral 
language (Kim, 2016, 2017; Kim & Phillips, 2014) predict comprehension monitoring in reading 
contexts (see also van der Schoot et al., 2012). 

When online comprehension monitoring was examined simultaneously with the two well-known, 
powerful predictors of reading comprehension (word reading and listening comprehension), it was 
not uniquely related to reading comprehension. In line with previous studies using latent variable 
approaches (e.g., Adlof et al., 2006; Kim, 2015, 2017; Silverman, Speece, Harring, & Ritchey, 2013), 
word reading and listening comprehension explained the vast majority (i.e., 96%) of variance in 
reading comprehension in the present study. Recent findings suggest that this strong explanatory 
power can be attributed to the fact that word reading and listening comprehension encompass a 
multitude of component skills of reading comprehension, such as oral language (e.g., vocabulary), 
low-level cognition (e.g., working memory), and higher order cognition (e.g., inference, monitoring; 
Kim, 2015, 2017). The present finding of no direct relation of online comprehension monitoring to 
reading comprehension indicates that the relation of comprehension monitoring to reading 
comprehension is indirect, via word reading and listening comprehension (Kim, 2015, 2017). This 
finding adds to growing evidence by measuring online comprehension monitoring using local 
rereading time and lookbacks in a normal reading context, whereas previous work examined 
comprehension monitoring using tasks that demanded a deliberate effort to find anomalies in text 
(e.g., Cain et al., 2004; Kim, 2015, 2017; Kim & Phillips, 2014; Oakhill & Cain, 2012). To our 
knowledge, this is the first study to investigate the relation of online comprehension monitoring to 
reading comprehension after accounting for word reading and listening comprehension. 

It will be important for future research to determine whether comprehension monitoring as 
measured in previous work (i.e., deliberate strategies) and online comprehension monitoring as 
measured in the present study are expressions of the same monitoring process, or whether they 
represent different control mechanisms that good readers should have at their disposal (see Kim & 
Phillips, 2014, for a discussion of the strategy vs. skill continuum of comprehension monitoring). 
Although the assumption is that the strategy can be trained to evolve into a skill (Afflerbach, 
Pearson, & Paris, 2008), our data suggest that detecting inconsistencies in written text is already 
found at the beginning of second grade, as evidenced by the longer rereading duration on target 
words and the sentence containing target words. One assumption that should be examined in future 
research is that with growing reading ability, readers will look back at critical preceding text regions 
(e.g., sentences 1-3 in this study) more often. For more skilled readers, these lookbacks should focus 
on disambiguating regions and ultimately result in better comprehension. A study with older 
children at a more advanced reading level would illuminate this question. Furthermore, this process 
of looking back at the disambiguating regions could certainly be subject to training by intervention. 

Although the focus of the present study was comprehension monitoring in the context of reading, 
a body of literature has shown that comprehension monitoring emerges from the meaning con- 
struction process in oral language contexts (Baker, 1984; Kim, 2015; Kim & Phillips, 2014; Markman, 
1977; Revelle et al., 1985; Skarakis-Doyle, 2002) and can be effectively taught in oral language 
contexts even for prekindergartners (Kim & Phillips, 2016). Therefore, explicit and systematic efforts 
to develop children’s comprehension monitoring, whether in written text or oral text contexts, would 
be important in instruction. In particular, given the essential role of oral language in reading 
(Hulme, Nash, Gooch, Lervag, & Snowling, 2015; Kim, 2017) and the present finding of a relation 
between listening comprehension and comprehension monitoring, explicit early instruction on 
comprehension monitoring in oral language contexts, as well as reading contexts, appears to be a 
reasonable recommendation. This could promote development of reading comprehension even in 
primary grades, where decoding instruction takes priority (e.g., Pearson & Duke, 2002). 

Reading is a complex phenomenon, requiring coordination of multiple processes and demands. In 
the present study, we focused on one of the processes—comprehension monitoring—and found that 
even beginning readers are sensitive to semantic inconsistency in written texts. However, online 
comprehension monitoring (indicated by greater time spent rereading inconsistent words and texts) 
was not associated with reading comprehension once word reading and listening comprehension were 
accounted for. Nonetheless, the present findings have shown the importance of studying moment-to- 
moment information processing during reading for a better understanding of this complex process. 
Appendix A 


Table A1. Examples of Inconsistent and Consistent Stories 


Inconsistent stories 

There was a rabbit named Fluffy. 
Fluffy’s food was always carrots. 
He often hopped around in a garden. 
Fluffy ate cabbage for his meals. 

Shelly is a cheerful little girl. 
She likes to wear only blue clothes. 
In her closet there are many dolls and toys. 
Today Shelly picked a yellow shirt and pants. 


Consistent stories 

George is a very happy farmer. 
On his farm, George grows only pumpkins. 
His farm has a large green field. 
George loves growing pumpkins on his farm. 

Seahorses live in oceans and seas. 
They are very slow swimmers. 
Usually they stay in shallow waters. 
Seahorses move slowly through the water. 


Note. Italics indicate embedded target inconsistent word, shown for demonstration purposes here. Italics were not used during 
data collection. 


Appendix B 


Distribution of Rereading Time for Target Words Before (Left) and After (Right) Log Transformation 